[SPARK-29203][SQL][TESTS] Reduce shuffle partitions in SQLQueryTestSuite #25891
wangyum wants to merge 6 commits into apache:master from wangyum:SPARK-29203
cc @gatorsmile and @cloud-fan too
Test build #111153 has finished for PR 25891 at commit
retest this please
@wangyum if tests fail and are tricky to fix, let's fix only SQLQueryTestSuite for now, since it takes the longest time.
Test build #111158 has finished for PR 25891 at commit
Test build #111159 has finished for PR 25891 at commit
-- !query 4 output
1 10 val3b 8 NULL
1 10 val1b 8 16
1 10 val3b 8 NULL
Ur, do we really need this?
Yes.
[info] - subquery/in-subquery/in-joins.sql *** FAILED *** (15 seconds, 359 milliseconds)
[info] subquery/in-subquery/in-joins.sql
[info] Expected "1 10 val[3b 8 NULL
[info] 1 10 val1b 8 16]
[info] 1 10 val3a 6 12
[info] 1 8...", but got "1 10 val[1b 8 16
[info] 1 10 val3b 8 NULL]
[info] 1 10 val3a 6 12
[info] 1 8..." Result did not match for query #4
[info] SELECT Count(DISTINCT(t1a)),
[info] t1b,
[info] t3a,
[info] t3b,
[info] t3c
[info] FROM t1 natural left JOIN t3
[info] WHERE t1a IN
[info] (
[info] SELECT t2a
[info] FROM t2
[info] WHERE t1d = t2d)
[info] AND t1b > t3b
[info] GROUP BY t1a,
[info] t1b,
[info] t3a,
[info] t3b,
[info] t3c
[info] ORDER BY t1a DESC, t3b DESC (SQLQueryTestSuite.scala:383)
[info] org.scalatest.exceptions.TestFailedException:
Got it. It seems that we are hitting the corner case because the query has a sort on a subset of columns.
def isSorted(plan: LogicalPlan): Boolean = plan match {
  case _: Join | _: Aggregate | _: Generate | _: Sample | _: Distinct => false
  case _: DescribeCommandBase
      | _: DescribeColumnCommand
      | _: DescribeTableStatement
      | _: DescribeColumnStatement => true
  case PhysicalOperation(_, _, Sort(_, true, _)) => true
  case _ => plan.children.iterator.exists(isSorted)
}
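To illustrate the corner case, here is a minimal sketch in plain Scala (no Spark dependency). It assumes the harness compares rows verbatim when the plan counts as sorted and sorts both sides otherwise; `SortBeforeCompare` and `normalize` are hypothetical names for illustration, not the actual SQLQueryTestSuite code. The two rows are taken from the failure log above.

```scala
// Hypothetical, simplified model of the comparison step (an assumption,
// not the real SQLQueryTestSuite implementation).
object SortBeforeCompare {
  /** Keeps row order only when the plan is treated as sorted; otherwise sorts the rows. */
  def normalize(rows: Seq[String], planIsSorted: Boolean): Seq[String] =
    if (planIsSorted) rows else rows.sorted

  def main(args: Array[String]): Unit = {
    // Both rows tie on the ORDER BY columns (t1a, t3b), so either order is a valid result.
    val expected = Seq("1 10 val3b 8 NULL", "1 10 val1b 8 16")
    val actual   = Seq("1 10 val1b 8 16", "1 10 val3b 8 NULL")

    // No ordering guarantee: both sides are sorted before comparing, so they match.
    println(normalize(expected, planIsSorted = false) == normalize(actual, planIsSorted = false))

    // Plan treated as sorted: rows are compared as-is, so the order of tied rows leaks through.
    println(normalize(expected, planIsSorted = true) == normalize(actual, planIsSorted = true))
  }
}
```

Under this model, a query whose ORDER BY covers only some of the output columns is compared verbatim, so any change in how tied rows arrive (for example, a different number of shuffle partitions) can change the golden output.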
override def sparkConf: SparkConf = super.sparkConf
  // Reduce shuffle partitions to reduce testing time.
  .set(SQLConf.SHUFFLE_PARTITIONS, 5)
For Python UDF test, this seems to increase from 4 to 5. Did I understand correctly?
OK. I'll try to set it to 4. This is because it is set to 5 in two places:
spark/sql/hive/src/test/scala/org/apache/spark/sql/hive/test/TestHive.scala
Lines 613 to 620 in 42b80ae
Setting it to 4 has another issue: --SET spark.sql.autoBroadcastJoinThreshold=10485760 succeeds, but --SET spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=true fails:
22:31:31.233 ERROR org.apache.spark.sql.SQLQueryTestSuite: Error using configs: spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=true,spark.sql.codegen.wholeStage=true,spark.sql.codegen.factoryMode=CODEGEN_ONLY
[info] - subquery/in-subquery/not-in-joins.sql *** FAILED *** (32 seconds, 609 milliseconds)
[info] subquery/in-subquery/not-in-joins.sql
[info] Expected "1 16 12 [21
[info] 1 16 12 10]
[info] 1 10 NULL 12
[info] 1 6 8 ...", but got "1 16 12 [10
[info] 1 16 12 21]
[info] 1 10 NULL 12
[info] 1 6 8 ..." Result did not match for query #6
[info] SELECT Count(DISTINCT( t1a )),
[info] t1b,
[info] t1c,
[info] t1d
[info] FROM t1
[info] WHERE t1a NOT IN (SELECT t2a
[info] FROM t2
[info] JOIN t1
[info] WHERE t2b <> t1b)
[info] GROUP BY t1b,
[info] t1c,
[info] t1d
[info] HAVING t1d NOT IN (SELECT t2d
[info] FROM t2
[info] WHERE t1d = t2d)
[info] ORDER BY t1b DESC (SQLQueryTestSuite.scala:383)
Can we add an ORDER BY to make the query output deterministic?
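A small sketch of why a fuller ORDER BY helps, in plain Scala with no Spark dependency. The `Row` case class and both helper functions are hypothetical, and the tie-breaking columns are chosen only for illustration; which columns the PR actually added is not shown in this thread. The two rows are the ones that swapped places in the failure log for query #6.

```scala
object DeterministicOrder {
  // Hypothetical row type for the four output columns of query #6.
  final case class Row(cnt: Int, t1b: Int, t1c: Int, t1d: Int)

  /** Models ORDER BY t1b DESC: a stable sort, so tied rows keep their arrival order. */
  def orderByT1b(rows: Seq[Row]): Seq[Row] =
    rows.sortBy(r => -r.t1b)

  /** Models ORDER BY t1b DESC, t1c DESC, t1d DESC: tie-breakers fix the order of these rows. */
  def orderByAll(rows: Seq[Row]): Seq[Row] =
    rows.sortBy(r => (-r.t1b, -r.t1c, -r.t1d))

  def main(args: Array[String]): Unit = {
    val a = Row(1, 16, 12, 21)
    val b = Row(1, 16, 12, 10)

    // Two arrival orders, standing in for runs with different shuffle settings.
    val run1 = Seq(a, b)
    val run2 = Seq(b, a)

    // Sorting on t1b alone leaves the tie unresolved, so the outputs differ.
    println(orderByT1b(run1) == orderByT1b(run2))

    // Sorting on all three columns yields the same output for both runs.
    println(orderByAll(run1) == orderByAll(run2))
  }
}
```

The same reasoning applies to the SQL test files: once the ORDER BY distinguishes every row in the result, the golden file no longer depends on the partition count or the join strategy.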
Thank you for taking care of this. In general, I agree with the idea. However, I left two comments about the result changes and the side-effect on Python UDF tests. It seems that we need to revise this PR more to achieve the original idea correctly.
I updated the PR description too (from 5 to 4).
Test build #111172 has finished for PR 25891 at commit
LGTM if tests pass
Test build #111185 has finished for PR 25891 at commit
retest this please
It should pass after this commit:
Test build #111202 has finished for PR 25891 at commit
retest this please
Test build #111207 has finished for PR 25891 at commit
Test build #111214 has finished for PR 25891 at commit
Test build #111218 has finished for PR 25891 at commit
Thank you all!
Merged to master.
+1, Late LGTM.
BTW, @wangyum. Can we have this in
Reducing the time looks super nice...
…estSuite

### What changes were proposed in this pull request?

This PR backports #25891 to `branch-2.4`.

### Why are the changes needed?

Reduce testing time.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Manually tested locally:

Before:
```
...
[info] - subquery/in-subquery/in-joins.sql (6 minutes, 19 seconds)
[info] - subquery/in-subquery/not-in-joins.sql (2 minutes, 17 seconds)
[info] - subquery/scalar-subquery/scalar-subquery-predicate.sql (45 seconds, 763 milliseconds)
...
Run completed in 1 hour, 22 minutes.
```
After:
```
...
[info] - subquery/in-subquery/in-joins.sql (1 minute, 12 seconds)
[info] - subquery/in-subquery/not-in-joins.sql (27 seconds, 541 milliseconds)
[info] - subquery/scalar-subquery/scalar-subquery-predicate.sql (17 seconds, 360 milliseconds)
...
Run completed in 47 minutes.
```

Closes #25938 from wangyum/SPARK-29203-branch-2.4.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Yuming Wang <wgyumg@gmail.com>
What changes were proposed in this pull request?
This PR reduces shuffle partitions from 200 to 4 in `SQLQueryTestSuite` to reduce testing time.
Why are the changes needed?
Reduce testing time.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Manually tested locally:
Before:
After: